Abstract

We study several sources of theoretical uncertainty in the determination of parton dis- 
tributions (PDFs) which may affect current PDF sets used for precision physics at the 
Large Hadron Collider, and explain discrepancies between them. We consider in particular
the use of fixed-flavor versus variable-flavor number renormalization schemes, higher
twist corrections, and nuclear corrections. We perform our study in the framework of the 
NNPDF2.3 global PDF determination, by quantifying in each case the impact of different 
theoretical assumptions on the output PDFs. We also study in each case the implications 
for benchmark cross sections at the LHC. We find that the impact in a global fit of a 
fixed-flavor number scheme is substantial, the impact of higher twists is negligible, and 
the impact of nuclear corrections is moderate and circumscribed. 
Precision physics at the LHC requires ever better estimates of parton distribution
(PDF) uncertainties (see e.g. [1]). At present PDF uncertainties do not include all sources
of theoretical uncertainty: they only reflect the uncertainty of the underlying data, and
(possibly) of the procedure used in the PDF determination, but not the effect of some
of the necessary theoretical approximations. However, these published PDF uncertainties
are now rapidly decreasing because of the availability of abundant and precise new data 
from the LHC. Hence, theoretical uncertainties are soon going to become significant and in 
certain cases even dominant. Indeed, this may already sometimes be the case: a recent
benchmarking of the dependence on PDFs of predictions for several LHC processes [2] 
shows that in some cases predictions obtained using different PDF sets disagree by a 
sizable amount on the scale of PDF uncertainties. It is then natural to ask whether 
some known differences in the theoretical approach used in different PDF extractions may 
explain these differences. 

There are two sources of theoretical uncertainty on which there is currently some, 
albeit partial, knowledge. The first is the dependence on the perturbative order. Since all 
PDF sets [1] are now available at NLO and NNLO (and most also at LO), the uncertainty 
on the NLO results can be determined exactly, and that on the NNLO result can be at 
least in principle estimated from the behavior of the perturbative series [3]. The second is 
the dependence on the matching scheme used to include heavy quark masses. Most PDF 
fitting groups use a so-called general-mass variable-flavor number (GM-VFN) scheme to
combine fixed order contributions computed with full inclusion of heavy quark masses 
with all-order resummation of contributions due to perturbative evolution in which heavy 
quarks are treated as massless partons. Several ways of doing so used by various PDF 
fitting groups, which differ by subleading terms, have been compared and benchmarked 
in Ref. [1]. However, some PDF fitting groups use a fixed-flavor number (FFN) scheme, 
where only the three lightest flavors and antiflavours are treated as massless partons and 
enter QCD evolution equations, while the contributions of heavy quarks are included in 
partonic cross sections. There are indications [5] that this choice may explain some, or 
perhaps even most, of the differences between PDF sets. Hence this issue deserves further 
investigation. 

There are two further obvious, potentially large, sources of theoretical uncertainty in 
PDF fits. The first is related to the treatment of the medium energy region, where
power-suppressed (higher-twist) contributions to the Wilson expansion may be relevant,
especially for deep-inelastic scattering (DIS) data, the kinematic coverage of which sometimes
extends to relatively low scales, not much above the nucleon mass. While the lowest-scale, 
potentially dangerous DIS data are usually excluded from PDF determinations by suitably 
chosen kinematic cuts, the potential impact of such data (and thus in particular the de- 
pendence on the choice of kinematic cuts) needs to be studied systematically. The second 
source of uncertainty is related to the fact that a sizable fraction of the DIS data (and also 
some Drell-Yan data) are obtained using nuclear targets: deuterium for charged-lepton
DIS, and heavy nuclei for neutrino DIS. These data are crucial for the separation of light
flavors, and it has been suggested recently [6,7] that corrections due to nuclear structure
may have a significant impact on the extraction of PDFs. 

In this paper we will consider these three, possibly dominant, sources of theoretical
uncertainties on PDFs: the use of a FFN scheme, the impact of higher twist terms, and
the impact of nuclear corrections. In each case, we will repeat the NNPDF2.3 NNLO PDF
fit [8] by varying the way these effects are treated, and will compare the results both in
terms of their impact on PDFs and the quality of the fit, and also by checking their impact
on standard LHC observables.

Figure 1: Distances between the default NNPDF2.3 PDFs, and PDFs obtained treating DIS data
in a FFN scheme. Distances are shown at the scale $Q_0^2 = 2$ GeV$^2$ at which PDFs are parametrized,
both between central values (left) and uncertainties (right).

FFN Schemes 

We first discuss the impact of the use of the FFN scheme to treat heavy flavor contri- 
butions. In the default NNPDF2.3 fit heavy quark mass effects in deep-inelastic structure 
functions are included using the FONLL-C scheme (see Refs. [8,9]). In this study we have
performed a new fit in which all data for DIS structure functions are treated using a FFN 
scheme, while other (hadronic) data are treated in the VFN scheme, with massless heavy
quarks. In the FFN fit, we take, for the DIS data, $\alpha_s^{(n_f=3)}(m_c) = 0.3680$ at $Q_0^2 = 2$ GeV$^2$,
which corresponds to $\alpha_s^{(n_f=5)}(M_Z) = 0.119$, and $\alpha_s^{(n_f=3)}(M_Z) = 0.1061$. For the hadronic
data, we also use $\alpha_s^{(n_f=5)}(M_Z) = 0.119$. The fits are performed at NNLO, using $O(\alpha_s^2)$
massive coefficient functions for charm (namely, the same massive charm terms as in the 
FONLL-C scheme). 
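
The relation between the three-flavor and five-flavor values of the strong coupling quoted above can be illustrated with a short numerical sketch. The code below is not the NNLO evolution used in the fits: it assumes one-loop running only, continuous matching at the heavy quark thresholds, and a bottom mass of 4.75 GeV (an assumed value, not taken from the text). It therefore reproduces only the qualitative feature that, starting from the same value at the charm threshold, the coupling evolved with three flavors is smaller at the Z mass than the one evolved with a variable number of flavors.

```python
import math

M_C2 = 2.0           # charm threshold squared in GeV^2 (initial scale quoted in the text)
M_B2 = 4.75 ** 2     # bottom threshold squared in GeV^2 (assumed value, not from the text)
M_Z2 = 91.1876 ** 2  # Z mass squared in GeV^2


def beta0(nf):
    """One-loop coefficient b0 in d(1/alpha_s)/d ln(Q^2) = b0."""
    return (33.0 - 2.0 * nf) / (12.0 * math.pi)


def run_one_loop(alpha_start, q2_start, q2_end, nf):
    """Evolve alpha_s at one loop from q2_start to q2_end with nf flavors."""
    inverse = 1.0 / alpha_start + beta0(nf) * math.log(q2_end / q2_start)
    return 1.0 / inverse


def alpha_s_ffn(q2, alpha_mc=0.3680):
    """Fixed-flavor coupling: nf = 3 at all scales q2 >= M_C2."""
    return run_one_loop(alpha_mc, M_C2, q2, nf=3)


def alpha_s_vfn(q2, alpha_mc=0.3680):
    """Variable-flavor coupling for q2 >= M_C2: nf = 4 up to m_b, nf = 5 above."""
    if q2 <= M_B2:
        return run_one_loop(alpha_mc, M_C2, q2, nf=4)
    alpha_mb = run_one_loop(alpha_mc, M_C2, M_B2, nf=4)
    return run_one_loop(alpha_mb, M_B2, q2, nf=5)


if __name__ == "__main__":
    print("FFN (nf=3) alpha_s(M_Z):", round(alpha_s_ffn(M_Z2), 4))
    print("VFN        alpha_s(M_Z):", round(alpha_s_vfn(M_Z2), 4))
```

Because only the one-loop term is kept, the numerical values differ from the NNLO ones quoted in the text; the ordering of the two couplings is the point of the illustration.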

The rationale for only treating DIS data in an FFN scheme in our study is that the 
use of a FFN scheme has been advocated [10,11] mostly in conjunction with the inclusion
of heavy quark mass terms in deep-inelastic heavy quark production. Heavy quark mass 
corrections to inclusive hadronic processes used in PDF determination are usually not 
included (though this could be done also in a VFN scheme [12] using the FONLL method 
used by NNPDF), so nothing is to be learnt by using a FFN scheme in the description of 
these data. 

In Fig. 1 we show the distances between central values and uncertainties of PDFs thus
determined, and the default NNPDF2.3 PDFs, at the initial scale $Q_0^2 = 2$ GeV$^2$. The
distance $d(x, Q^2)$ between a pair of replica samples for a certain PDF at a given value of $x$
and $Q^2$ (as defined in Appendix A of Ref. [13]) is basically the difference of the means of
the two samples, in units of the standard deviation of the mean (distance between central
values), or the difference of their standard deviations in units of the standard deviation of
the standard deviation (distance between uncertainties).

Figure 2: Same as Fig. 1 but at $Q^2 = 10^4$ GeV$^2$ (panel title: NNPDF2.3 NNLO Global VFN vs. FFN;
central value on the left, uncertainty on the right). All PDFs have been evolved upwards using the
same standard (VFN) evolution equations.

The definition entails that if we
compare two different samples of $N_{\rm rep}$ replicas, each extracted from the same distribution,
then on average $d = 1$, while if the two samples are extracted from two distributions whose
means differ by one standard deviation, then on average $d = \sqrt{N_{\rm rep}}$, the difference being
due to the fact that the standard deviation of the mean scales as $1/\sqrt{N_{\rm rep}}$. So $d \sim 1$
corresponds to statistically equivalent PDFs, while $d \sim 10$ (with $N_{\rm rep} = 100$ replicas)
corresponds to statistically inequivalent PDFs which differ by one sigma.
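
The scaling of the distance with the number of replicas can be checked with a toy numerical experiment. The sketch below is a simplified stand-in for the definition of Ref. [13], which is not reproduced here: it assumes Gaussian toy replicas and measures the difference of the two sample means in units of the standard error of the mean of the first sample only. The function names and the toy setup are illustrative assumptions.

```python
import math
import random
import statistics


def distance_central(sample_a, sample_b):
    """|mean_a - mean_b| in units of the standard error of the mean of sample_a.

    Simplified version of the distance between central values; the exact
    definition of Ref. [13] may combine the two sample variances differently.
    """
    std_err = statistics.stdev(sample_a) / math.sqrt(len(sample_a))
    return abs(statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err


def average_distance(shift_in_sigma, n_rep=100, n_trials=200, seed=0):
    """Average distance between Gaussian toy samples whose means differ by
    shift_in_sigma standard deviations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_rep)]
        b = [rng.gauss(shift_in_sigma, 1.0) for _ in range(n_rep)]
        total += distance_central(a, b)
    return total / n_trials


if __name__ == "__main__":
    print("same distribution :", average_distance(0.0))
    print("one-sigma shift   :", average_distance(1.0))
```

With 100 replicas per sample, the first average comes out of order one and the second close to ten, in line with the reading of the distance plots given above.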

It is clear from Fig. 1 that the two fits are inequivalent, but differences are moderate,
at the half sigma level or so, with the change being observed in central values, but with no 
significant change in uncertainties. The PDF which varies most is the gluon: this is easily 
understandable given that the gluon is determined mostly by scaling violations, which are 
different in the FFN case. 

For collider physics applications, the initial PDFs, whether determined in a GM-VFN 
scheme, or in a FFN scheme [10,11], are evolved upwards using the usual VFN evolution
equations. It turns out that this evolution amplifies differences between GM-VFN and
FFN PDFs. The amplification is demonstrated in Fig. 2, where the same distances of
Fig. 1 are shown, but now at the scale $Q^2 = 10^4$ GeV$^2$ (relevant e.g. for W, Z or
Higgs production). While uncertainties are still unchanged, central values for some PDFs 
(specifically the gluon and the quark singlet) now differ by more than one sigma. The 
fact that differences become larger when evolving to higher scales can be understood as a 
consequence of the fact that differences in the large x region (where uncertainties are large) 
at low scale lead upon evolution to high scale to differences in the small x region, where 
uncertainties are relatively small. In Fig. 3 the PDFs that change most with respect to
the standard NNPDF2.3 ones when adopting a FFN scheme are compared to their default
counterparts, shown as a ratio to NNPDF2.3.

Having ascertained that the impact of choosing a FFN scheme is not negligible, we next 
ask whether a theoretical uncertainty of order of the difference between PDFs extracted 
in the FFN and GM-VFN schemes is being neglected when results are presented in a 
particular scheme. In order to understand if this is the case, we have studied the fit 
quality in the two fits.

Figure 3: Comparison between FFN and default NNPDF2.3 PDFs, displayed as a ratio to the
default NNPDF2.3 at $Q^2 = 10^4$ GeV$^2$ (panel titles: Ratio to NNPDF2.3 NNLO, $\alpha_s = 0.119$): gluon (top left), quark singlet (top right), up (bottom left)
and down (bottom right).

In Table 1 we show the difference between the $\chi^2$ of the DIS data,
computed using the FFN or VFN PDF sets, both for the total DIS dataset and for the
subset of the combined HERA-I data [14], in various kinematic regions. The so-called
'experimental' definition of the $\chi^2$ (to be found in the Appendix of Ref. [2]) is used. The
contribution to the difference from HERA DIS data is shown separately, and in each
case the number of data points is given. Note that the default NNPDF2.3 cut on the final
state invariant mass $W^2 > 12.5$ GeV$^2$ is always imposed. The difference is always positive,
indicating that the fit quality is always worse in the FFN case than it is with the default
GM-VFN scheme. The difference in $\chi^2$ is also positive for all the remaining data in the
global fit, and is equal to about 20 units for about 750 data points. Hence, the FFN PDFs
provide a worse fit to the global dataset, and especially a worse fit to DIS data.
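
The comparison of fit quality described above amounts to evaluating a correlated $\chi^2$ twice on the same data, once with each set of theory predictions, and taking the difference. A minimal sketch follows, assuming a generic covariance matrix; the specific construction of the 'experimental' covariance matrix of Ref. [2] is not reproduced, and the numbers in the usage example are invented purely for illustration.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        partial = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - partial) / aug[row][row]
    return x


def chi2(data, theory, covariance):
    """Absolute chi^2 = r^T C^{-1} r with r = data - theory (not per data point)."""
    residual = [d - t for d, t in zip(data, theory)]
    weighted = solve_linear(covariance, residual)
    return sum(r * w for r, w in zip(residual, weighted))


def delta_chi2(data, theory_ffn, theory_vfn, covariance):
    """chi^2(FFN) - chi^2(VFN); positive means the FFN description is worse."""
    return chi2(data, theory_ffn, covariance) - chi2(data, theory_vfn, covariance)


if __name__ == "__main__":
    # Invented two-point example with a correlated covariance matrix.
    data = [1.0, 2.0]
    covariance = [[0.04, 0.01], [0.01, 0.09]]
    print(delta_chi2(data, [1.3, 1.7], [1.1, 1.9], covariance))
```

As in Table 1, the quantity returned is the absolute difference, not divided by the number of data points.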

Comparing the fit quality in different kinematic regions, one can see that the deterioration
in fit quality for the FFN structure functions is concentrated in the region of
large $Q^2 > 100$ GeV$^2$ and small $x < 0.1$ (and thus mostly the HERA data). This may
perhaps be understood in terms of the so-called 'double-asymptotic scaling' properties of
the structure function $F_2$ in this region [15]: the rise of the structure function at small $x$
has a universal logarithmic slope, driven by perturbative evolution, which depends on the
number of active flavors, and current HERA data are precise enough to see the change of
slope when going above the b threshold (see Ref. [16], in particular Fig. 5). In a FFN scheme
the contribution of heavy flavors to this rise is expanded out to finite order rather than 
being exponentiated to all orders, and this is likely to provide a worse description of this 
double scaling behaviour. 

We conclude that the FFN fit is actually based on a less precise theory, in that it
does not include full resummation of the contribution of heavy quarks to perturbative
PDF evolution, and thus provides a less accurate description of the data. The difference
between FFN and GM-VFN PDFs should, therefore, be added as a theoretical uncertainty
to FFN PDFs, but not to GM-VFN ones, just like the difference between NLO and NNLO
PDFs is part of the theoretical uncertainty on NLO PDFs, which disappears when going
to NNLO.

Table 1: Difference $\Delta\chi^2 = \chi^2_{\rm FFN} - \chi^2_{\rm VFN}$ for deep-inelastic data, as described using FFN
theory with the best-fit PDFs obtained from a global fit in which PDFs are treated in a FFN
scheme, and using the default NNPDF2.3 set. Note that the numbers shown are for the absolute
$\chi^2$, not divided by the number of data points. The contribution from combined HERA-I data is
also shown in the last column. Results are shown in various kinematic regions, taking into account
experimental correlations. The number of DIS or HERA-I data points after cuts is also shown in
each case. The first row corresponds to the default cuts of the NNPDF2.3 fit.

Higher Twist 

We turn now to the study of the impact on PDF determinations of power-suppressed 
corrections. As is well known, DIS structure function data are affected both by power 
corrections of kinematic origin related to the mass of the target (TMCs, henceforth), as 
well as corrections related to higher twist contributions to the Wilson expansion. The 
former can be determined exactly in the form of an expansion in powers of $m_N^2/Q^2$, with
$m_N$ the nucleon target mass, while the latter are of dynamical origin, and thus if included
they must be fitted, just like the leading twist PDFs. 

Currently, TMCs are included up to $O(m_N^2/Q^2)$ in the NNPDF2.3 (and in fact in all
previous NNPDF sets) and in the ABM11 PDF determinations, though not in other PDF
determinations such as MSTW08 [17], CT10 [18], and HERAPDF1.5 [19,20], where they
are kept under control through suitable kinematic cuts. Dynamical higher twist corrections,
on the other hand, are parametrized and fitted in the ABM11 PDF determination,
but not in any of the other PDF sets, and in particular not in NNPDF2.3, where again
they are kept under control by imposing a suitable kinematic cut on the invariant mass
of the final state: NNPDF2.3 removes DIS data for which $W^2 < 12.5$ GeV$^2$, and only
includes data with $Q^2 > 3.0$ GeV$^2$. Similar cuts are adopted in other global fits: CT10 removes
data with $W^2 < 12.25$ GeV$^2$ and only accepts data with $Q^2 > 4.0$ GeV$^2$; MSTW08
removes data with $W^2 < 15.0$ GeV$^2$ and only accepts data with $Q^2 > 2.0$ GeV$^2$. HERAPDF1.5
only removes data with $Q^2 < 3.5$ GeV$^2$ as there are no low $W^2$ data in their dataset,
while ABM11, who fit higher twist corrections, only remove DIS data with $W^2 < 3.24$
GeV$^2$ while including all data with $Q^2 > 3.24$ GeV$^2$.
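
To make the effect of these cuts concrete, the following sketch applies the $W^2$ and $Q^2$ thresholds quoted above for NNPDF2.3, CT10 and MSTW08 to individual DIS kinematic points. It assumes the standard relation $W^2 = m_N^2 + Q^2 (1-x)/x$ for the final-state invariant mass and an approximate nucleon mass of 0.938 GeV; the dictionary layout and function names are illustrative choices, not part of any fitting code.

```python
M_N = 0.938  # nucleon mass in GeV (approximate)

# Cut values as quoted in the text, in GeV^2.
CUTS = {
    "NNPDF2.3": {"w2_min": 12.5, "q2_min": 3.0},
    "CT10": {"w2_min": 12.25, "q2_min": 4.0},
    "MSTW08": {"w2_min": 15.0, "q2_min": 2.0},
}


def w2(x, q2):
    """Invariant mass squared of the hadronic final state in DIS."""
    return M_N ** 2 + q2 * (1.0 - x) / x


def passes(x, q2, cut):
    """True if the point is kept: W^2 not below w2_min and Q^2 above q2_min."""
    return w2(x, q2) >= cut["w2_min"] and q2 > cut["q2_min"]


def surviving_fits(x, q2):
    """Names of the fits whose cuts keep the kinematic point (x, q2)."""
    return [name for name, cut in CUTS.items() if passes(x, q2, cut)]


if __name__ == "__main__":
    for point in [(0.1, 5.0), (0.1, 3.5), (0.7, 10.0)]:
        print(point, surviving_fits(*point))
```

The last point in the usage example, at large $x$ and moderate $Q^2$, has $W^2$ of about 5 GeV$^2$ and is removed by all three sets of cuts, which is the kinematic region where higher twist effects are expected to be largest.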

One may nevertheless worry that despite these cuts there might be a residual non-negligible
uncertainty related to the neglect of higher twist corrections.

Figure 4: The shape of the higher twist correction $H_i(x)$ of Eq. (1), for the proton structure
function $F_2$, as determined in Ref. [11]. The correction is also shown rescaled by a factor $p_{\rm HT} = -1$
or $p_{\rm HT} = 2$.

To estimate this uncertainty, we
have performed a series of fits in which the leading twist computation of DIS structure
functions is supplemented by a twist four correction

$$F_i(x,Q^2) = F_i^{\rm LT}(x,Q^2) + p_{\rm HT}\,\frac{H_i(x)}{Q^2} \, , \qquad (1)$$

where $F_i^{\rm LT}(x,Q^2)$ is the leading twist NNPDF2.3 determination of the longitudinal or
transverse structure function (including target-mass corrections), $H_i(x)$ is a function,
with dimensions of mass squared, assumed to be independent of $Q^2$ (thus neglecting the
logarithmic scale dependence of higher twist corrections) and to be taken from models
or from an independent fit, and $p_{\rm HT}$ is a constant, to be used to rescale the size of
the higher twist correction. We have further assumed for $H_i(x)$ the form that was
obtained in Ref. [11] along with the ABM11 PDF set, and we have varied the parameter
in the range $-1 \le p_{\rm HT} \le 2$, i.e. we have made it twice as large, or reversed its sign. The shape of
$p_{\rm HT} H_i(x)$ is displayed in Fig. 4.
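
The structure of the correction in Eq. (1) can be sketched in a few lines: a leading twist structure function is shifted by a term that falls like $1/Q^2$ and is rescaled by a constant. In the code below both the leading twist function and the shape of the twist-four coefficient are toy placeholders invented for illustration; they are not the NNPDF2.3 structure functions or the fitted shape of Ref. [11].

```python
def toy_leading_twist(x, q2):
    """Placeholder leading twist structure function (hypothetical shape)."""
    return 0.5 * (1.0 - x) ** 3


def toy_h(x):
    """Placeholder twist-four coefficient in GeV^2 (hypothetical shape)."""
    return 0.1 * x ** 2


def with_higher_twist(leading_twist, h, x, q2, p_ht=1.0):
    """Leading twist result plus p_ht * h(x) / Q^2, following the form of Eq. (1)."""
    return leading_twist(x, q2) + p_ht * h(x) / q2


def relative_shift(x, q2, p_ht=1.0):
    """Relative size of the higher twist term for the toy inputs above."""
    lt = toy_leading_twist(x, q2)
    return (with_higher_twist(toy_leading_twist, toy_h, x, q2, p_ht) - lt) / lt


if __name__ == "__main__":
    for q2 in (4.0, 40.0):
        print(q2, [relative_shift(0.6, q2, p) for p in (-1.0, 1.0, 2.0)])
```

The power suppression is explicit: raising $Q^2$ by a factor of ten reduces the relative shift by the same factor, which is why a cut removing low-scale data limits the impact of such terms.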

In Fig. 5 we show the distances between the PDFs determined including the higher
twist correction Eq. (1) with $p_{\rm HT} = 1$ and the default fit. The PDFs are mostly indistinguishable,
the distance being compatible with statistical fluctuations. The only PDF
which changes by a statistically significant amount is $s - \bar{s}$ (which is arguably the worst
determined PDF combination in the context of a global fit), which undergoes a shift by
about half a sigma in the valence region. The other PDFs which are most affected, namely
the total strangeness, valence and singlet, change even less. The PDFs which change most
are compared in Fig. 6: the changes are barely visible.

We have then repeated the PDF determination with the extreme choices $p_{\rm HT} = -1$
and $p_{\rm HT} = 2$ in Eq. (1). Distances are shown in Fig. 7. It is clear that when the sign
of higher twist corrections is reversed, their effect remains negligible, and even when they
are arbitrarily doubled, PDFs always change by less than half a sigma, and mostly much
less than that. These fits are performed with the same default cut $W^2 > 12.5$ GeV$^2$ adopted
in NNPDF PDF determinations: it appears that with this cut, the impact of including
higher twist corrections to DIS structure functions is negligible. The same conclusion is
very likely to apply to the MSTW08 and CT10 PDF determinations, which adopt similar
cuts.

Figure 5: Distances between PDFs determined including the higher twist correction shown in
Fig. 4 with $p_{\rm HT} = 1$ in Eq. (1), and the default NNPDF2.3 set (panel title: NNPDF2.3 NNLO Global Ref vs. HT with $p_{\rm HT} = 1$, $Q^2 = 2$ GeV$^2$).

Figure 6: Comparison of PDFs which are mostly affected by higher twist corrections, Eq. (1) with
$p_{\rm HT} = 1$, and the default (distances were shown in Fig. 5): singlet (top left), total valence quark
(top right), strange $s + \bar{s}$, and $s - \bar{s}$ (bottom right), all shown at $Q^2 = 2$ GeV$^2$.

In Table 2 we show the $\chi^2$ for the three fits including higher twist corrections, both
for the global dataset, and for each experiment. It appears that when $p_{\rm HT} = 1$ the global
$\chi^2$ is essentially unchanged, with the improvement in the description of the SLAC data
compensated by the deterioration of the fit to BCDMS data. With the other two choices,
$p_{\rm HT} = -1$ or $p_{\rm HT} = 2$, the fit quality deteriorates substantially, as one might expect.

Table 2: The $\chi^2$ of the global fit before and after the inclusion of higher twist corrections, for the
three scenarios shown in Fig. 4. We also provide the number of data points for each experiment.
Note that this is the absolute $\chi^2$, not divided by the number of data points.

As a final consistency check, we have repeated the standard NNPDF2.3 PDF determination
with no higher twist corrections, but with a stricter cut $W^2 > 20$ GeV$^2$. With this
cut, any possible residual effect of higher twists in the default fit would be greatly reduced,
and thus the variation of results is an indication of their possible presence in the default
fit. The distances between this PDF set and the default are shown in Fig. 8. Again, they
are barely above the level of statistical fluctuations: because the more stringent cut
changes the dataset, full statistical equivalence is not expected, but changes are below,
usually much below, the half sigma level, and thus not statistically significant. We conclude
that higher twist corrections and associated uncertainties are negligible in current
global NNPDF sets, and so is their impact on the quality of the global fit. Similar results
were found in a related MSTW analysis [21].

Nuclear Corrections 

Finally, we discuss the impact of nuclear corrections on PDF determinations. In the
NNPDF2.3 fit (and in other global fits such as MSTW08 and CT10) three classes of data
which may be affected by nuclear corrections are used: neutrino DIS data, which are
obtained on heavy, approximately isoscalar, nuclear targets (such as iron); fixed-target
data for DIS on deuterium, and fixed-target Drell-Yan data on deuterium. The impact
of nuclear corrections on neutrino DIS data was studied in Ref. [22], and found to be
negligible in comparison to the sizable uncertainties on these data. Nuclear corrections to
deuterium are rather smaller than those for heavy nuclei, but structure function data on
deuterium targets can be quite precise, so here they could have an impact, especially on
the determination of the up-down quark ratio at large x.

Figure 7: Same as Fig. 5 but with $p_{\rm HT} = -1$ (top row) and $p_{\rm HT} = 2$ (bottom row), both at $Q^2 = 2$ GeV$^2$.

Figure 8: Distances between the reference NNPDF2.3 PDFs, with $W^2 > 12.5$ GeV$^2$, and PDFs
obtained imposing on the dataset the tighter constraint $W^2 > 20$ GeV$^2$.

The possible impact of deuterium nuclear corrections was recently emphasized in
Ref. [7] where, relying on previous studies, the CJ12 PDF sets were presented,
based on CTEQ methodology but including nuclear corrections to deuterium structure
function data derived using a variety of models. The impact of deuterium corrections was
also recently studied in Ref. [23], where they were fitted to the data.

Figure 9: The deuteron nuclear correction factor $c(x)$ of Eq. (2), from the fit of Ref. [24] (labeled MMSTWW) and
from the three models of Ref. [7] (labeled CJmin, CJmid and CJmax). The MMSTWW correction
factor is $Q^2$-independent, while the CJ models are shown for $Q^2 = 100$ GeV$^2$.

We have studied the impact of deuterium corrections on the NNPDF2.3 PDF deter- 
mination by correcting all deuterium structure function data according to 

$$F_2^d(x,Q^2) = c(x)\,\frac{F_2^p(x,Q^2) + F_2^n(x,Q^2)}{2} \, . \qquad (2)$$

For the correction factor $c(x)$ we have first adopted the phenomenological determination
obtained in Ref. [24]. This is $Q^2$-independent and such that the correction is negative
below a given value of $x$ and positive above it, and is parametrized by three parameters
determined through a global PDF fit based on the MSTW08 methodology. We have also
computed $c(x)$ for the three choices considered in Ref. [7] (CJmin, CJmid and CJmax)
using the expressions of $F_2^p(x,Q^2)$, $F_2^n(x,Q^2)$ and $F_2^d(x,Q^2)$ provided by the authors.

Eq. (2) should be viewed as a K-factor approximation, because in the nuclear models
used in Ref. [7] the nuclear correction is not just multiplicative, but rather it is a $Q^2$-dependent
correction which depends on the structure function itself, partly in a convolutive
way. This approximation is adequate for our current goal, which is to determine the size of
these corrections and their associated uncertainties, rather than their shape. Even though
the correction of Ref. [7] is scale dependent, we have evaluated it at $Q^2 = 100$ GeV$^2$ and
we have assumed it to be $Q^2$-independent, as the $Q^2$ dependence is weak in the region
$x < 0.5$ [25] where, as we shall see, the impact of the correction is significant. The shape
of the nuclear correction Eq. (2) in all these cases is displayed in Fig. 9.
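
The multiplicative form of Eq. (2) is simple enough to sketch directly. In the code below the proton and neutron structure functions and the correction factor are toy placeholders invented for illustration: the linear shape chosen for $c(x)$ only mimics the qualitative behaviour described above (negative correction below a crossing point, positive above it) and is not the parametrization of Ref. [24] or any of the models of Ref. [7].

```python
def toy_f2_proton(x, q2):
    """Placeholder proton structure function (hypothetical shape)."""
    return 0.6 * (1.0 - x) ** 3


def toy_f2_neutron(x, q2):
    """Placeholder neutron structure function (hypothetical shape)."""
    return 0.4 * (1.0 - x) ** 3


def toy_c(x, x_cross=0.4, slope=0.05):
    """Hypothetical Q^2-independent correction factor: below one for x < x_cross,
    above one for x > x_cross."""
    return 1.0 + slope * (x - x_cross)


def f2_deuteron(f2_p, f2_n, c, x, q2):
    """Deuteron structure function per nucleon in the K-factor form of Eq. (2)."""
    return c(x) * 0.5 * (f2_p(x, q2) + f2_n(x, q2))


def isoscalar_average(f2_p, f2_n, x, q2):
    """Uncorrected average of proton and neutron structure functions."""
    return 0.5 * (f2_p(x, q2) + f2_n(x, q2))


if __name__ == "__main__":
    for x in (0.1, 0.4, 0.7):
        ratio = (f2_deuteron(toy_f2_proton, toy_f2_neutron, toy_c, x, 100.0)
                 / isoscalar_average(toy_f2_proton, toy_f2_neutron, x, 100.0))
        print(x, ratio)
```

Because the factor multiplies the sum of proton and neutron contributions, the ratio printed in the usage example is just $c(x)$ itself, which is what makes this a K-factor approximation rather than a full nuclear model.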

We have then repeated the NNPDF2.3 PDF determination including the correction
according to each of these four models in turn. Note that only deuterium structure function
data are corrected, so in particular Drell-Yan data on fixed deuterium targets remain
uncorrected: indeed, Refs. [7,21] only consider deuterium corrections to DIS structure
functions. These data, however, are mostly in a kinematic region where nuclear corrections
are very small. The distances between PDFs obtained including nuclear corrections in this
way, and the default NNPDF2.3 PDFs, are displayed in Fig. 10. We only show distances
between central values, as we have verified that uncertainties are unaffected. The only PDF
combination which is significantly affected by the introduction of deuterium corrections is
the isospin triplet, which changes by more than one sigma for 0.1 < x < 0.5 and up to one
and a half sigma at the valence peak x ~ 0.3, for the intermediate MMSTWW and CJmid
cases. Changes of up to half a sigma at the peak are seen also in the valence and $\Delta_S = \bar{d} - \bar{u}$
combinations, though these, as well as all other PDFs, mostly display changes which are
compatible with statistical fluctuations. In the extreme CJmax case, the variation can be
up to three sigma at the peak for the isospin triplet, and up to one sigma for the singlet
and $\Delta_S$. A comparison between the default fit, and that with nuclear corrections included
following Ref. [24], is presented in Fig. 11 for the two PDF combinations which change
most, namely the triplet and singlet.

Figure 10: Distances between the reference NNPDF2.3 PDFs, and PDFs obtained by introducing
nuclear corrections to deuterium structure function data according to Eq. (2), with the four nuclear
correction factors shown in Fig. 9. Only distances between central values are shown.

In Ref. [7] it was argued that the quality of the global fit is essentially unaffected
by these nuclear corrections, which are absorbed in a change of the PDFs, but only if
they are not too large, and thus in particular it was argued that deuterium corrections
as large as CJmax are disfavored by the data. In Ref. [23] a moderate but non-negligible
improvement in fit quality was found when the nuclear corrections are added. In our case,
we find that the fit quality is essentially unaffected by the inclusion of nuclear corrections,
unless they are too large, in which case the fit quality deteriorates significantly. In Table 3
we show the $\chi^2$ for the global fit and for each dataset included in it, both for the default
fit and for the fits with deuterium corrections. The fit quality deteriorates somewhat upon
inclusion of nuclear corrections, by an amount per data point ($\Delta\chi^2 \sim 0.01$) which (with
the MMSTWW form of the correction) is about half the improvement seen in Ref. [23]. It
is interesting to observe that most of this deterioration comes from the CHORUS neutrino
deep-inelastic scattering data, which are obtained using heavy nuclear targets. These data
are corrected for nuclear effects in Ref. [24], but not in our study, which might explain the
difference. When CJmin and CJmid corrections are applied, the fit quality deteriorates
by a similar or somewhat larger amount (also mostly due to CHORUS data), while it
deteriorates significantly if CJmax corrections are used, in agreement with the findings of
Ref. [7].

Figure 11: Comparison of PDFs which are mostly affected by deuterium nuclear corrections,
using the MMSTWW model, and the default (distances were shown in Fig. 10): isotriplet (left)
and singlet (right), all shown at $Q^2 = 2$ GeV$^2$.

The role of nuclear corrections in determining the down/up ratio in the $x \to 1$ limit has
been especially emphasized in Refs. [6,7,23]. In order to elucidate this point, in Fig. 12
we look at the $d(x)/u(x)$ ratio at $Q^2 = 2$ GeV$^2$: we show the NNPDF2.3 result with
uncertainty, and we superimpose on it the central value of the ratio obtained in fits where
different types of deuterium nuclear corrections are included (left plot). For comparison,
we also show the effect on the central value of the ratio of only fitting to DIS data, of
using a FFN scheme, and of including higher twists according to Eq. (1) with the default
choice $p_{\rm HT} = 1$ (right plot). PDF uncertainties are computed as 68% CL, since at large
$x$ uncertainties show non-gaussian behaviour. It is clear that the impact of deuterium
corrections is visible for $x < 0.5$, as already seen in the distances plots of Fig. 10, but
it is completely negligible in comparison to the uncertainty for larger $x$ values. Similarly
negligible are the impact of higher twist corrections, and even of the use of a FFN scheme.
By contrast, what does have a significant impact, up to very large $x \sim 0.8$, is the use of
DIS only data in the PDF determination.
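
The reason for quoting 68% confidence levels rather than standard deviations can be illustrated on a replica sample with a long tail. The sketch below uses one common prescription, assumed here for illustration since the exact procedure is not spelled out in the text: sort the replica values and discard the lowest and highest 16%. The sample in the usage example is invented.

```python
import statistics


def central_68_interval(replica_values):
    """Central 68% interval of a replica sample: drop the lowest and highest 16%.

    This is one common prescription, assumed here for illustration.
    """
    ordered = sorted(replica_values)
    n = len(ordered)
    n_drop = int(round(0.16 * n))
    return ordered[n_drop], ordered[n - 1 - n_drop]


if __name__ == "__main__":
    # Invented sample: 99 well-behaved replicas plus one extreme outlier.
    replicas = [float(k) for k in range(1, 100)] + [10000.0]
    low, high = central_68_interval(replicas)
    print("68% CL interval:", (low, high))
    print("standard deviation:", statistics.stdev(replicas))
```

In this example a single outlying replica inflates the standard deviation by more than an order of magnitude, while the 68% interval is unchanged with respect to the sample without the outlier, which is the behaviour wanted when the replica distribution is non-gaussian.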

We can therefore conclude that the use of a global dataset, including in particular a
wide variety of hadronic data, is crucial in order to have a handle on the large x flavor
separation. On the other hand for x > 0.5, the light flavor separation is affected by a
large uncertainty due to the scarcity of experimental information. Theoretical uncertainties
related to the effects which we study here, and in particular nuclear corrections, are
completely negligible on the scale of these uncertainties. A somewhat different conclusion
was reached in Ref. [7], where it was argued that nuclear corrections affect significantly the
$d(x)/u(x)$ ratio in the $x \to 1$ limit. In this reference, a wider dataset was used, including
low $Q^2$ and low $W^2$ data, while higher twist and nuclear corrections were introduced.
Whereas inclusion of such data might raise somewhat the value of x at which uncertainties
start blowing up, it appears that when using the more general NNPDF parametrization,
rather than the more restrictive one used in Ref. [7], the uncertainties on the $d(x)/u(x)$
ratio in the $x \to 1$ limit are necessarily so large that nuclear corrections are unlikely to
play a significant role.

Table 3: The $\chi^2$ of the global fit before and after the inclusion of deuterium nuclear corrections,
with the four models shown in Fig. 9. We also provide the number of data points for each
experiment. Note that this is the absolute $\chi^2$, not divided by the number of data points.

Figure 12: The $d/u$ ratio at $Q^2 = 2$ GeV$^2$ as a function of x. The central value obtained with
several variants of the fit is compared to the default NNPDF2.3 result and uncertainty. Left: effect
of different models of deuterium corrections shown in Fig. 9. Right: effect of using a FFN scheme,
of using DIS-only data, and of including the higher twist correction shown in Fig. 4 with the central
choice $p_{\rm HT} = 1$. PDF uncertainties are computed as 68% confidence levels.

Figure 13: Some LHC standard candles computed at NNLO using various PDF sets discussed in
this paper: electroweak gauge boson production (top) and $t\bar{t}$ production (bottom) at $\sqrt{s} = 8$ TeV. All results are shown
for $\alpha_s(M_Z) = 0.119$.

We conclude that the impact of deuterium nuclear corrections is non-negligible in the 
region 0.1 < x < 0.5 on the isospin triplet combination, i.e. on the up-down separation.
However, the theoretical status of these corrections is not entirely satisfactory: a variety 
of models is available, but they are not clearly favored by the data, and may become 
disfavored if the correction is large. Given the uncertainties involved, the inclusion of such 
corrections is not clearly advantageous at present. However, it should be kept in mind 
that the uncertainty on the isotriplet should be supplemented by a theoretical uncertainty 
related to nuclear corrections in the 0.1 < x < 0.5 region. 

Conclusions 

We summarize our results by looking at the impact of the three different sources of
theoretical uncertainties considered here on the predictions for a representative set of LHC
standard candles, namely the total cross sections for electroweak gauge boson production, which is sensitive
to quark and antiquark distributions, and top production, which is sensitive to the gluon
distribution at large Bjorken-x. We have computed these processes at NNLO, using the
default NNPDF2.3 PDFs and various other sets discussed in this paper, as well as with
other PDF sets which we have referred to in the course of the present discussion, namely
ABM11 [11], CT10 [26,27] and MSTW08 [17], all with a common value of $\alpha_s(M_Z) = 0.119$.
The codes and settings used to compute the various cross sections are the same as in the
recent benchmark study Ref. [2]; for top production we have used the more recent version
2.0 of the Top++ code [28] (the NNPDF2.3 and ABM11 results shown are thus the same
as in Ref. [29]). Higher twist corrections are shown using the $p_{\rm HT} = 1$ curve from Fig. 4,
and deuteron corrections are shown using the MMSTWW curve from Fig. 9.

These cross-sections are shown in Fig. 13. It is clear that the impact of higher twist
and nuclear corrections is negligible, both compared to PDF uncertainties and to the
differences between different sets. Based on our previous discussion, it is likely that for
higher twist corrections this will always be the case, while nuclear corrections might have
a visible impact, up to the half sigma level, on sufficiently exclusive observables which are
sensitive to the up-down difference at large x (such as, for example, the charge asymmetry
for very high mass virtual W production, or, more interestingly, heavy new particles
with flavour-dependent couplings). On the other hand, the use of a FFN scheme has a
visible impact, comparable to that of using a smaller dataset which only includes DIS data:
however, in the latter case the PDF uncertainty automatically increases because of the
smaller dataset, while when the FFN scheme is adopted an extra theoretical uncertainty
should be added to the result. It is interesting to observe that Fig. 13 shows that adopting
a FFN scheme, or fitting only DIS data, makes the NNPDF2.3 results closer to those
obtained using the ABM11 set, which is based on a smaller (mostly DIS) dataset and
which uses a FFN scheme. Indeed, we have also produced a fit using the FFN scheme to
DIS data only: the cross-sections we get, also shown in Fig. 13, are in surprisingly good
agreement with those obtained using the ABM11 set.

In summary, we have studied the impact of three sources of theoretical uncertainties
on PDF determinations: the use of a FFN scheme, and the inclusion of higher twist
corrections and deuterium nuclear corrections. We conclude that, adopting the dataset,
methodology, and kinematic cuts of the NNPDF2.3 PDF determination, the impact of
the FFN scheme is significant, especially at high scales ($Q^2 \sim M_W^2$), and that it leads to an extra
uncertainty on the results obtained. Higher twist corrections have by contrast a negligible 
impact. Deuterium nuclear corrections have a moderate impact, up to one sigma, but only 
on the up-down separation in the large x region (x ~ 0.3). Because of the poor theoretical 
knowledge of these effects, this should again be treated as an extra theoretical uncertainty. 
Forthcoming LHC data may help in keeping some of these uncertainties under control, by 
allowing PDF determinations which make no use of data which are subject to nuclear 
corrections [1]. 